Add ViViT (Video Vision Transformer) to KerasCV #2335
base: master
Thanks for the PR, @aditya02shah. Can you please add a Colab demo to verify the results and also share the weights file with us? How does this compare to the HF implementation?
@divyashreepathihalli This implementation closely aligns with the one used in keras-examples.
@aditya02shah What is expected here is that the outputs of your model's last layer match the outputs of the HF implementation. Am I right, @divyashreepathihalli?
FYI, Official implementation: https://github.com/google-research/scenic/tree/aaeaa203bfbbaf3d2c6d9865fe86d1379cfe4a58/scenic/projects/vivit
Thanks Adithya!! If the outputs match the example, that is good enough. But I would like to see a Colab demo that uses the changes from your PR.
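One way to check the kind of agreement discussed above is to compare final-layer outputs numerically. The following is a minimal sketch, not the procedure used in this PR: `outputs_match` is a hypothetical helper, the tolerances are arbitrary assumptions, and the arrays are random stand-ins for real model outputs.

```python
import numpy as np


def outputs_match(candidate, reference, rtol=1e-4, atol=1e-4):
    """Return (match, max_abs_diff) for two arrays of final-layer outputs.

    Hypothetical helper; the default tolerances are assumptions and should
    be chosen to suit the dtype and backend being compared.
    """
    candidate = np.asarray(candidate)
    reference = np.asarray(reference)
    # Different shapes can never match, so report an infinite difference.
    if candidate.shape != reference.shape:
        return False, float("inf")
    max_abs_diff = float(np.max(np.abs(candidate - reference)))
    match = bool(np.allclose(candidate, reference, rtol=rtol, atol=atol))
    return match, max_abs_diff


# Stand-in data: in practice these would be the outputs of the two
# implementations on the same input with the same weights.
rng = np.random.default_rng(0)
reference = rng.normal(size=(2, 16, 128))
candidate = reference + 1e-6

ok, diff = outputs_match(candidate, reference)
print(f"match={ok}, max_abs_diff={diff:.2e}")
```

Reporting the maximum absolute difference alongside the boolean makes it easier to tell a small numerical drift from a real mismatch.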
@divyashreepathihalli I've created a Colab demo that incorporates the changes from my pull request. You can access it here.
Thank you for the PR, @aditya02shah. I have left a few cleanup comments. Also, let's make sure the tests pass.
        self.patch_size = patch_size

    def build(self, input_shape):
        self.projection = keras.layers.Conv3D(
Define all layers in __init__ and build them here, like self.layer_name.build(expected_input_shape).
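The pattern requested in this review comment can be sketched roughly as follows. This is an illustration under stated assumptions, not the code from the PR: the class name `TubeletEmbedding`, the `embed_dim` argument, and the `norm` sublayer are hypothetical, and only the `patch_size` attribute and the `Conv3D` projection appear in the reviewed snippet.

```python
import keras
import numpy as np


class TubeletEmbedding(keras.layers.Layer):
    """Hypothetical layer illustrating create-in-__init__, build-in-build."""

    def __init__(self, embed_dim, patch_size, **kwargs):
        super().__init__(**kwargs)
        self.embed_dim = embed_dim
        self.patch_size = patch_size
        # Sublayers are created here, not inside build().
        self.projection = keras.layers.Conv3D(
            filters=embed_dim,
            kernel_size=patch_size,
            strides=patch_size,
            padding="valid",
        )
        # Assumed second sublayer, included only to show shape chaining.
        self.norm = keras.layers.LayerNormalization()

    def build(self, input_shape):
        # Each sublayer is built explicitly with the shape it will receive.
        self.projection.build(input_shape)
        projected_shape = self.projection.compute_output_shape(input_shape)
        self.norm.build(projected_shape)
        self.built = True

    def call(self, videos):
        return self.norm(self.projection(videos))


# Usage: one 4-frame, 8x8 RGB clip, split into 2x4x4 tubelets.
layer = TubeletEmbedding(embed_dim=8, patch_size=(2, 4, 4))
out = layer(np.zeros((1, 4, 8, 8, 3), dtype="float32"))
print(out.shape)
```

Chaining `compute_output_shape` from one sublayer to the next keeps each explicit `build` call consistent with the shapes seen at call time.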
@divyashreepathihalli I have made the recommended changes. You can find the Colab for the latest commit here.
Thanks @aditya02shah!! PS: we will fix this overhead soon, but in the meantime this is what we need to do.
@divyashreepathihalli No worries, I have updated the build script!
Mostly LGTM, just one NIT regarding the build method.
@divyashreepathihalli I have made revisions to the build method. Colab for the latest changes.
Thanks for the update, @aditya02shah! There is one error that needs to be fixed.
What does this PR do?
Adding ViViT model
Overview:
This PR integrates the ViViT model into KerasCV and adds the relevant test cases.
Who can review?
@divyashreepathihalli